A novel feature extraction method based on highly expressed SNPs for tissue-specific gene prediction

Authors

Abstract

Background: Gene expression provides a means for an organism to produce the gene products necessary for the organism to live. Variation in the significant gene expression levels can distinguish the gene and the tissue in which the gene is expressed. Tissue-specific gene expression, often determined by single nucleotide polymorphisms (SNPs), provides potential molecular markers or therapeutic targets for disease progression. Therefore, SNPs are good candidates for identifying disease progression. The current bioinformatics literature uses network modeling to summarize complex interactions between transcription factors, genes, and gene products. Here, our focus is on the SNPs' impact on tissue-specific gene expression levels. To the best of our knowledge, we are not aware of any studies that distinguish tissue-specific genes using SNP expression.

Method: We propose a novel feature extraction method based on highly expressed SNPs, using k-mers as features. We also propose optimal k-mer and feature sizes for use in our approach. Determining the optimal sizes is still an open research question, as it depends on the dataset and the purpose of the analysis. We evaluate our algorithm's performance on a range of k-mer and feature sizes using a multinomial naive Bayes (MNB) classifier on genes in 49 human tissues from the Genotype-Tissue Expression (GTEx) portal.

Conclusions: Our approach achieves practical performance results with k-mers of size 3. Based on the purpose of the analysis and the number of tissue-specific genes under study, feature sizes [7, 8, 9] and [8, 9, 10] are typically optimal for the machine learning model.
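
To make the idea of k-mers as features for an MNB classifier concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not say how highly expressed SNPs are encoded as sequences, so the toy strings, the tissue labels, and the names `kmer_counts` and `TinyMNB` are all hypothetical, chosen only for illustration. The sketch counts overlapping k-mers of size 3 and fits a small multinomial naive Bayes model with Laplace smoothing.

```python
from collections import Counter
import math


def kmer_counts(seq, k=3):
    """Count overlapping k-mers of length k in a string."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


class TinyMNB:
    """Minimal multinomial naive Bayes with Laplace (add-alpha) smoothing."""

    def fit(self, count_dicts, labels, alpha=1.0):
        self.vocab = sorted({km for c in count_dicts for km in c})
        self.classes = sorted(set(labels))
        self.log_prior = {}
        self.log_like = {}
        for cls in self.classes:
            total = Counter()
            n_cls = 0
            for counts, label in zip(count_dicts, labels):
                if label == cls:
                    total.update(counts)
                    n_cls += 1
            self.log_prior[cls] = math.log(n_cls / len(labels))
            denom = sum(total.values()) + alpha * len(self.vocab)
            self.log_like[cls] = {
                km: math.log((total[km] + alpha) / denom) for km in self.vocab
            }
        return self

    def predict(self, counts):
        def score(cls):
            like = self.log_like[cls]
            # k-mers never seen in training are ignored
            return self.log_prior[cls] + sum(
                n * like[km] for km, n in counts.items() if km in like
            )
        return max(self.classes, key=score)


# Hypothetical toy data: (sequence, tissue label). Not GTEx data.
train = [
    ("ACGACGACGA", "liver"),
    ("ACGACGTACG", "liver"),
    ("TTGTTGTTGA", "lung"),
    ("TTGTTGCTTG", "lung"),
]
model = TinyMNB().fit(
    [kmer_counts(seq) for seq, _ in train],
    [label for _, label in train],
)
prediction = model.predict(kmer_counts("ACGACGACG"))
```

With this toy data, a query dominated by `ACG` k-mers scores higher under the "liver" class than under "lung". In practice a library classifier such as scikit-learn's `MultinomialNB` would likely replace `TinyMNB`. The feature sizes discussed in the abstract are a separate setting that this sketch does not model.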

Similar resources

Multivariate Feature Extraction for Prediction of Future Gene Expression Profile

Introduction: The features of a cell can be extracted from its gene expression profile. If the gene expression profiles of future descendant cells are predicted, the features of the future cells are also predicted. The objective of this study was to design an artificial neural network to predict gene expression profiles of descendant cells that will be generated by division/differentiation of h...

A Novel Fuzzy Based Method for Heart Rate Variability Prediction

Abstract: In this paper, a novel technique based on a fuzzy method is presented for chaotic nonlinear time series prediction. A fuzzy approach with the gradient learning algorithm and methods constitutes the main components of this method. The learning process in this method is similar to the conventional gradient descent learning process, except that the input patterns and parameters are stored in mem...

A Fast Localization and Feature Extraction Method Based on Wavelet Transform in Iris Recognition

With an increasing emphasis on security, automated personal identification based on biometrics has been receiving extensive attention. Iris recognition, as an emerging biometric recognition approach, is becoming a very active topic in both research and practical applications. In general, a typical iris recognition system includes iris imaging, iris liveness detection, and recognition. This rese...

A method for speeding up feature extraction based on KPCA

Kernel principal component analysis (KPCA) extracts features of samples with an efficiency in inverse proportion to the size of the training sample set. In this paper, we develop a novel method to improve KPCA-based feature extraction. The developed method is the first one that is methodologically consistent with KPCA. Experiments on several benchmark datasets illustrate that the feature extrac...

Journal

Journal title: Journal of Big Data

Year: 2021

ISSN: 2196-1115

DOI: https://doi.org/10.1186/s40537-021-00497-9